Conversation

mrotondo (Contributor)

use number of workers to control concurrency of mef jobs, and number of threads to control concurrency of webhook callback jobs


I am curious how you landed on 15 rather than 10 or 20?

mrotondo (Contributor Author)

The existing value was 5, and I figured that (if I want all connection attempts to succeed immediately) I'm adding 10 new threads that might need a DB connection: 5 single-threaded workers for the MeF jobs, plus one 5-threaded worker for the callbacks.

I'm not actually sure of myself here (maybe the webhook callback worker only needs one DB connection, since the job itself doesn't touch the DB, so the only need is for the worker to claim and execute jobs?), but that was my reasoning, faulty or not.
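For reference, the worker split described above (5 single-threaded MeF workers plus one 5-threaded callback worker) could be expressed along these lines in Solid Queue's `config/queue.yml`; the queue names here are made up purely for illustration, not taken from the PR:

```yaml
# Hypothetical sketch -- queue names are illustrative, not from the PR.
development:
  workers:
    - queues: mef_submissions    # assumed queue name
      processes: 5               # 5 single-threaded workers for MeF jobs
      threads: 1
    - queues: webhook_callbacks  # assumed queue name
      processes: 1               # one worker process...
      threads: 5                 # ...running 5 threads for callbacks
```

Under this layout each worker thread can hold a DB connection, which is where the "10 new possible threads" count comes from.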

Contributor

Are you running the queue embedded in the Puma server, or are you running the jobs queue standalone? If it's standalone, I don't think you should need to change this.

Contributor

I'm also unsure, though, about how many threads are reasonable for Postgres. I've never really deviated from the default.

Contributor

Re: the Puma server: I would imagine standalone, since there would be additional configuration needed for Puma.

mrotondo (Contributor Author)

The queue is a Puma plugin in development and standalone elsewhere; see:

```ruby
plugin :solid_queue if ENV["SOLID_QUEUE_IN_PUMA"] || Rails.env.development?
```

The actual error I get when running `bin/rails s` in development is:

```
Solid Queue is configured to use 7 threads but the database connection pool is 5. Increase it in `config/database.yml`
```

I assume the 7-thread requirement is coming from the 5-threaded webhook callback worker, per this note in the Solid Queue docs:

> It is recommended to set this value less than or equal to the queue database's connection pool size minus 2, as each worker thread uses one connection, and two additional connections are reserved for polling and heartbeat.

I'm going to resolve this by reducing the number of threads the webhook callback worker uses to 3, rather than increasing the DB connection pool. I'm uncertain about the impact of the latter option, and the former should be relatively low-impact: our webhook callback jobs shouldn't take long to complete, since I'm also adding a 5-second timeout to all webhook callback requests, down from the default of 60(!).
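The pool-size rule quoted above (one connection per worker thread, plus one for polling and one for heartbeat) can be sanity-checked with a quick sketch; the numbers below mirror the 5-vs-7 situation from the error message:

```ruby
# Per the Solid Queue docs note quoted above: each worker thread uses one DB
# connection, and two additional connections are reserved for polling and
# heartbeat. So the pool must be at least worker_threads + 2.
def required_pool_size(worker_threads)
  worker_threads + 2
end

puts required_pool_size(5)  # => 7; the original 5-threaded callback worker
                            #    overflows the default pool of 5
puts required_pool_size(3)  # => 5; dropping to 3 threads fits the default pool
```

This is just arithmetic restating the docs' recommendation, but it shows why 3 threads is the largest value that avoids touching `config/database.yml`.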

@anisharamnani (Contributor)

what is the way to go about testing this?

… to remain at default, and add a timeout of 5 seconds to webhook callback requests
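The 5-second timeout mentioned in the commit above might look something like this. Net::HTTP is shown purely as an illustration (the PR's actual HTTP client isn't visible in this thread), and `post_webhook` is a hypothetical helper name:

```ruby
require "net/http"
require "uri"

# Hypothetical helper: POST a webhook callback with tight timeouts, so a slow
# receiver can't hold a worker thread for the default 60 seconds.
def post_webhook(url, body)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: 5,          # seconds to establish the connection
                  read_timeout: 5) do |http| # seconds to wait for the response
    http.post(uri.path.empty? ? "/" : uri.path, body,
              "Content-Type" => "application/json")
  end
end
```

Keeping callback requests short is what makes the 3-thread callback worker a low-impact choice: each thread is released quickly whether or not the receiver responds.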
@mrotondo (Contributor Author)

> what is the way to go about testing this?

I've tested it manually, and just did so again on video! Let me know if you can get to this @anisharamnani @powersurge360 https://codeforamerica.zoom.us/clips/share/7aa-RyQ8Szy6-WgN3N-9Tg

@mrotondo (Contributor Author)

@anisharamnani @powersurge360

It turns out reality is not so simple - I just updated the MeF SDK so that we can make actual requests to ATS again and ran this same test, and got several Session limit reached errors. So I think that this PR is insufficient to keep us under our limit and needs some changes. I'll bump it in Slack when it's ready for another look!

@anisharamnani (Contributor) left a comment

thanks mike for your thorough testing video! greatly appreciated :)
